An Approach for Arabic Root Generating and Lexicon Development

Author

  • Mohamed Osman Hegazi
Abstract

This paper presents a novel approach to Arabic root generation and lexicon development. The approach comprises three algorithms. The first, the root generator algorithm, generates Arabic roots by applying permutations and combinations to the letters of the Arabic alphabet. The second, the morphology developing algorithm, derives different words from each root by formulating the root according to Arabic morphological templates. Finally, the lexicon is constructed by providing meanings and other information. The paper contributes to the field of Natural Language Processing (NLP) by providing an Arabic NLP approach that can be used in several ways to generate Arabic roots and develop Arabic lexicons. Because it offers a way to cover most components of the Arabic language, the approach can serve as a basis for processing, understanding, and electronic use of Arabic. The results show that the approach yields up to 21,924 different Arabic roots with their morphological information.
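The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several details are assumptions: that roots are triliteral, that no letter repeats within a root, and that the alphabet is taken as 29 symbols (the 28 basic letters plus hamza). Under those assumptions the number of ordered selections is 29 × 28 × 27 = 21,924, which coincides with the figure reported in the abstract. The function names `generate_roots`, `apply_template`, and `lexicon_entry` are hypothetical, as is the convention of marking template slots with the letters feh, ain, and lam.

```python
from itertools import permutations
from math import perm

# Assumed inventory: hamza plus the 28 basic letters, built from Unicode code
# points so the count is easy to audit (1 + 2 + 17 + 8 + 1 = 29).
LETTERS = [chr(c) for c in (
    0x0621,                    # hamza
    0x0627, 0x0628,            # alef, beh
    *range(0x062A, 0x063B),    # teh .. ghain (17 letters)
    *range(0x0641, 0x0649),    # feh .. waw (8 letters)
    0x064A,                    # yeh
)]


def generate_roots(letters=LETTERS, length=3):
    """Yield every ordered selection of `length` distinct letters."""
    return permutations(letters, length)


def apply_template(root, template):
    """Fill a template whose three slots are written as feh, ain, lam.

    Assumption: those three letters appear in the template only as slots,
    never as literal affix letters.
    """
    slots = str.maketrans({"ف": root[0], "ع": root[1], "ل": root[2]})
    return template.translate(slots)


def lexicon_entry(root, template, meaning=None):
    """Bundle a derived word with its root, template, and an optional meaning."""
    return {
        "root": "".join(root),
        "template": template,
        "word": apply_template(root, template),
        "meaning": meaning,  # supplied separately; not generated here
    }


if __name__ == "__main__":
    # Count the candidate roots without storing them all.
    print(sum(1 for _ in generate_roots()), perm(len(LETTERS), 3))
    # Derive a word from the root k-t-b with the active-participle template.
    print(lexicon_entry(("ك", "ت", "ب"), "فاعل", meaning="writer"))
```

Exhaustive permutation produces every candidate string, so many of the generated triples will not be attested Arabic roots; any filtering against real usage would be a separate step that this sketch does not attempt.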

Similar resources

Induction of Root and Pattern Lexicon for Unsupervised Morphological Analysis of Arabic

We propose an unsupervised approach to learning non-concatenative morphology, which we apply to induce a lexicon of Arabic roots and pattern templates. The approach is based on the idea that roots and patterns may be revealed through mutually recursive scoring based on hypothesized pattern and root frequencies. After a further iterative refinement stage, morphological analysis with the induced ...

Unsupervised Induction of Arabic Root and Pattern Lexicons using Machine Learning

We describe an approach to building a morphological analyser of Arabic by inducing a lexicon of root and pattern templates from an unannotated corpus. Using maximum entropy modelling, we capture orthographic features from surface words, and cluster the words based on the similarity of their possible roots or patterns. From these clusters, we extract root and pattern lexicons, which allows us to...

Imam Sadegh’s (AS) Hadiths in Sunni’s lexicon

The Quran and the Hadiths, including the Hadiths of the Infallibles (AS) such as Imam Sadegh (AS), have long been among the reference sources for compilation and a field of research for Arab morphologists. This research illustrates Imam Sadegh’s (AS) Hadiths as they appear in Sunni lexicons and then in other books of Islamic sciences, in order to identify where these Hadiths hav...

The Effects of Factorizing Root and Pattern Mapping in Translating between Tunisian Arabic and Standard Arabic

The development of natural language processing tools for dialects faces the severe problem of lack of resources. In cases of diglossia, as in Arabic, one variant, Modern Standard Arabic (MSA), has many resources that can be used to build natural language processing tools. Whereas other variants, Arabic dialects, are resource poor. Taking advantage of the closeness of MSA and its dialects, one w...

Automatic Verification and Augmentation of Multilingual Lexicons

We present an approach for automatic verification and augmentation of multilingual lexica. We exploit existing parallel and monolingual corpora to extract multilingual correspondents via triangulation. We demonstrate the efficacy of our approach on two publicly available resources: Tharwa, a three-way lexicon comprising Dialectal Arabic, Modern Standard Arabic and English lemmas among other inf...

Publication date: 2016